Add RSE configuration guide for operators #710

Open
Soap2G wants to merge 7 commits into rucio:main from Soap2G:gguerrie-rse-docs

@Soap2G
Contributor

Soap2G commented Jan 9, 2026

Closes #709

Documentation on setting up and configuring Rucio Storage Elements (RSEs) from an operator's perspective. Includes:

  • Overview of RSE types (POSIX, WebDAV, Disk, Tape) (removed in f80b901)
  • Two setup methods: CLI and Python API with side-by-side examples
  • Configuration examples for each RSE type
  • WebDAV setup with Apache configuration and davs protocol (removed in 040ee73)
  • EOS disk RSE with https and root protocols
  • CTA tape RSE configuration with staging timeouts
  • RSE attributes, protocols, and account limits reference
  • Best practices and common pitfalls
  • Quick reference commands

The examples use the latest rucio CLI commands

Soap2G self-assigned this Jan 9, 2026

@voetberg
Contributor

voetberg commented Jan 9, 2026

Overall all these changes are really good!!

The only thing I didn't comment on directly in the body of the review is that we might want to mention how RSE-specific limits work versus account-only limits.

@Soap2G
Contributor Author

Soap2G commented Jan 12, 2026

@voetberg Added a few fixes in 317c0db.

Also, I stumbled upon https://rucio.github.io/documentation/operator/configuration/#creating-new-rses; should I add a link to there pointing to this page?

@voetberg
Contributor

> @voetberg Added a few fixes in 317c0db.
>
> Also, I stumbled upon https://rucio.github.io/documentation/operator/configuration/#creating-new-rses; should I add a link to there pointing to this page?

I would say this PR completely supersedes that page, and I would link to this page and drastically reduce what's on the config params page. (Maybe just summarizing into a TL;DR with "add rse", "add rse attribute", "add rse protocol", "add account limit")

If you would like, I can take that over and you can just link your page there with something like "An in-depth guide to configuring RSEs can be found here"

@Soap2G
Contributor Author

Soap2G commented Jan 12, 2026

> > @voetberg Added a few fixes in 317c0db.
> > Also, I stumbled upon https://rucio.github.io/documentation/operator/configuration/#creating-new-rses; should I add a link to there pointing to this page?
>
> I would say this PR completely supersedes that page, and I would link to this page and drastically reduce what's on the config params page. (Maybe just summarizing into a TL;DR with "add rse", "add rse attribute", "add rse protocol", "add account limit")
>
> If you would like, I can take that over and you can just link your page there with something like "An in-depth guide to configuring RSEs can be found here"

Cool, then I'll let you take care of the summary, and I'll just link this page in there. As soon as I have some info about istape, I'll finish this up. Thanks!!

@panta-123
Contributor

panta-123 commented Jan 12, 2026

@Soap2G, there seems to be an existing section in the docs titled "Creating new RSEs":
https://rucio.cern.ch/documentation/operator/configuration#creating-new-rses

Quotas are also discussed here: https://rucio.cern.ch/documentation/operator/configuration/#setting-quota-and-permissions

We should consolidate the two and have a single place for this information.

So I would suggest:
We either put all this information into https://rucio.cern.ch/documentation/operator/configuration/ or add a link there to the new docs file.

@Soap2G
Contributor Author

Soap2G commented Jan 14, 2026

> @Soap2G, there seems to be an existing section in the docs titled "Creating new RSEs": https://rucio.cern.ch/documentation/operator/configuration#creating-new-rses
>
> Quotas are also discussed here: https://rucio.cern.ch/documentation/operator/configuration/#setting-quota-and-permissions
>
> We should consolidate the two and have a single place for this information.
>
> So I would suggest: We either put all this information into https://rucio.cern.ch/documentation/operator/configuration/ or add a link there to the new docs file.

See this comment; Maggie will take care of that once the page is up.

@Soap2G
Contributor Author

Soap2G commented Jan 16, 2026

@voetberg @panta-123 We should be ready to go, with the reminder of merging the redundant RSE-related pages after this is done.

@voetberg
Contributor

@Soap2G Please rebase to grab the pre-commit ci!

Soap2G and others added 4 commits January 19, 2026 13:40
Documentation on setting up and configuring Rucio Storage
Elements (RSEs) from an operator's perspective. Includes:

- Overview of RSE types (POSIX, WebDAV, Disk, Tape)
- Two setup methods: CLI and Python API with side-by-side examples
- Configuration examples for each RSE type
- WebDAV setup with Apache configuration and davs protocol
- EOS disk RSE with https and root protocols
- CTA tape RSE configuration with staging timeouts
- RSE attributes, protocols, and account limits reference
- Best practices and common pitfalls
- Quick reference commands

The examples use the latest `rucio` CLI commands

Co-authored-by: Nikita Avdeev <naavdeev.astro@gmail.com>
Co-authored-by: Luis Antonio Obis Aparicio <luis.antonio.obis@gmail.com>
…nstead of core

Used RSEClient class for rse operations, and AccountLimitClient for account.
Additionally added a paragraph about configuration concepts.
Removed istape from RSE config guide, as it's not needed by Rucio
and it can be replaced by rse_type.

Additionally, added a clearer description of istape in the attributes page.
Soap2G force-pushed the gguerrie-rse-docs branch from d46af70 to f80b901 on January 19, 2026 12:41

@Soap2G
Contributor Author

Soap2G commented Jan 19, 2026

> @Soap2G Please rebase to grab the pre-commit ci!

Hey @voetberg, done 😁

@voetberg
Contributor

voetberg left a comment

lgtm - nothing seems obviously wrong and if people want to add more examples we can do that in the future

voetberg previously approved these changes Jan 20, 2026

@voetberg
Contributor

@panta-123 Do you want to give this a read-over and review?

voetberg previously approved these changes Jan 28, 2026

This commit consolidates multiple documentation corrections and clarifications:

- Corrected RSE settings vs attributes distinction
   - Fixed TypedDict field listings to match actual implementation
   - Removed non-existent protocol priority fields (priority_lan, priority_wan)
   - Clarified that lfn2pfn_algorithm is an RSE attribute, not a creation parameter
   - Restored geographic fields (city, country_name, latitude, longitude, region_code, time_zone) as valid RSE settings that can be set via gateway API

- Clarified deterministic vs non-deterministic RSE behavior
   - lfn2pfn_algorithm: for deterministic RSEs (disk), computes paths from scope+name only
   - naming_convention: for non-deterministic RSEs (tape), uses metadata/timestamps
   - Added detailed comparison table explaining the differences
   - Updated all examples to reflect correct usage patterns

- Fixed Python API examples
   - Removed incorrect lfn2pfn_algorithm parameter from add_rse() calls
   - Shows correct attribute-based configuration via add_rse_attribute()
   - Updated workflow examples to match actual client implementation

- CLI fixes
   - Minor fixes to commands structure
@Soap2G
Contributor Author

Soap2G commented Feb 3, 2026

Thanks to @Geogouz for the AI-assisted review. It was very useful to spot some inconsistencies in the text.
It also helped in cross checking the CLI commands structure (the suggestions sometimes mixed up legacy and new commands, but overall was useful).

I've also slightly updated configuration_parameters to reflect some clarifications needed in the RSE page.
Most of the content is about RSE settings / attributes, and lfn2pfn algos vs non deterministic RSEs.

Reviews are welcome @voetberg @panta-123

@Geogouz
Contributor

Geogouz commented Feb 3, 2026

> Thanks to @Geogouz for the AI-assisted review. It was very useful to spot some inconsistencies in the text. It also helped in cross checking the CLI commands structure (the suggestions sometimes mixed up legacy and new commands, but overall was useful).
>
> I've also slightly updated configuration_parameters to reflect some clarifications needed in the RSE page. Most of the content is about RSE settings / attributes, and lfn2pfn algos vs non deterministic RSEs.
>
> Reviews are welcome @voetberg @panta-123

At Rucio's service upon request :D. For reference, here is what the output was in case others would like to use it too in the future. May not be perfect, but I would say it clearly does more good than harm to have it as an additional review opinion:

1) “RSE Attributes” section mixes up settings vs attributes

Inaccurate / misleading in the guide

  • Treating these as “RSE attributes” you set with rucio rse attribute add / add_rse_attribute(...):
    • rse_type
    • verify_checksum

What’s correct

  • rse_type is an RSE setting/property stored on the RSE record (not an attribute). Update it via rucio rse update / RSEClient.update_rse(...) (or set it at creation time).
  • verify_checksum is an RSE setting/property (not an attribute). Update it via rucio rse update / RSEClient.update_rse(...) and pass a boolean.
  • lfn2pfn_algorithm is documented as an RSE attribute (and is surfaced in the RSE “settings” output). It’s typically set via rucio rse attribute add / RSEClient.add_rse_attribute(...) and is effectively immutable afterwards.

Why this matters: setting rse_type / verify_checksum via the attribute path won’t change the RSE settings you think it changes, so the deployment won’t behave as described.


2) The “Mandatory attributes” danger box is incorrect

Inaccurate

  • “Mandatory attributes: rse_type, fts, lfn2pfn_algorithm”

What’s correct

  • rse_type:
    • Not an attribute; it’s an RSE setting.
    • Defaults to DISK if not set.
  • fts:
    • Is an RSE attribute, but it’s only needed if your deployment uses FTS for transfers (e.g., third‑party copy via FTS).
  • lfn2pfn_algorithm:
    • Is an RSE attribute, but it’s not mandatory because there is a policy default when none is set.
    • It’s also effectively immutable after creation, so treat it as a creation-time decision.

3) Protocol priority rules in the guide are wrong / misleading

Inaccurate

  • “Higher numbers indicate higher priority”

Misleading / needs tightening

  • “Priority 0 or omitted disables the protocol”

What’s correct directionally

  • Priority ordering is not “bigger number = more preferred”. The transfer tooling considers protocols “ordered by priority”, and examples/documentation consistently use 1 for enabled operations and 0 to disable an operation.
  • Use 0 to disable an operation, and use a positive integer (commonly 1) to enable it. If an operation key is omitted, treat it as “not supported”.
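
As a rough illustration of the 1 = enabled / 0 = disabled convention described above, the sketch below models a protocol's per-domain operation map as a plain dictionary and lists which operations count as supported. The names `EXAMPLE_DOMAINS` and `enabled_operations` are hypothetical, and the exact set of operation keys a given Rucio release accepts is an assumption here; this is not Rucio code.

```python
from typing import Dict, List

# Hypothetical per-domain operation map: 1 enables an operation, 0 disables it.
# The key names are assumed for illustration only.
Domains = Dict[str, Dict[str, int]]

EXAMPLE_DOMAINS: Domains = {
    "lan": {"read": 1, "write": 1, "delete": 1},
    "wan": {"read": 1, "write": 0, "delete": 0, "third_party_copy_read": 1},
}


def enabled_operations(domains: Domains, domain: str) -> List[str]:
    """Return the operations with a positive priority in the given domain.

    An operation set to 0, or missing altogether, is treated as not supported.
    """
    operations = domains.get(domain, {})
    return sorted(op for op, priority in operations.items() if priority > 0)
```

With the example map, the `wan` domain supports only reads and third-party-copy reads, because `write` and `delete` are explicitly set to 0.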

4) CLI examples don’t match the current rucio (click) CLI, and the guide mixes rucio vs rucio-admin

Inaccurate / misleading in the guide

  • Using positional RSE arguments where the click-based rucio CLI expects options, e.g.:
    • rucio rse add RSE_NAME
    • rucio rse protocol add ... RSE_NAME
    • rucio rse distance add --distance 1 SOURCE_RSE DEST_RSE
    • rucio account limit add account_name --rse RSE_NAME --bytes quota
  • Treating the legacy rucio-admin syntax and the click-based rucio syntax as interchangeable.

What’s correct directionally

  • In the documented click-based rucio CLI, RSEs are passed via --rse / --rses, distance endpoints via --source/--destination, and protocol host via --host, e.g.:
    • rucio rse add --rse XRD1
    • rucio rse protocol add --host xrd1 --rse XRD1 ...
    • rucio rse distance add --source XRD1 --destination XRD2 --distance 1
    • rucio account limit add --account root --rses XRD1 --bytes infinity
  • rucio-admin is a different (legacy) CLI with different subcommand names and flags (e.g. rucio-admin rse add-protocol --hostname ...). If the guide wants to support both, it must explicitly split “rucio (click)” vs “rucio-admin (legacy)” examples.

5) RSE inspection / “Quick reference” command names are wrong (or at least not the ones documented)

Inaccurate / misleading

  • rucio rse info RSE_NAME
  • rucio rse protocol list RSE_NAME
  • rucio rse usage RSE_NAME

What’s correct directionally

  • For the click-based client, rucio rse show is the documented command for inspecting an RSE.
  • For usage specifically, the documented client command is rucio list-rse-usage RSE_NAME.

6) Python API examples: wrong method for settings + missing required arguments

Inaccurate

  • Setting rse_type via add_rse_attribute('RSE', 'rse_type', ...).
  • Setting verify_checksum via add_rse_attribute(...).
  • Calling AccountLimitClient.set_account_limit(account, rse, bytes_) without a locality.
  • Passing boolean-ish values as strings (e.g. 'False') instead of actual booleans.

What’s correct

  • Set RSE settings via:
    • RSEClient.add_rse('RSE', rse_type='TAPE', ...), or
    • RSEClient.update_rse('RSE', {'rse_type': 'TAPE', 'verify_checksum': False, ...})
  • Use add_rse_attribute for attributes like lfn2pfn_algorithm, fts, archive_timeout, greedyDeletion, etc.
  • Account limits:
    • set_account_limit(account, rse, bytes_, locality) where locality is 'local' or 'global', or use the convenience methods set_local_account_limit(...) / set_global_account_limit(...).
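
The settings-versus-attributes split above could be sketched roughly as follows. The helper names (`SETTING_KEYS`, `split_settings_and_attributes`, `apply_rse_config`) and the `RecordingClient` stand-in are hypothetical. The only client calls assumed are `update_rse(rse, parameters)` and `add_rse_attribute(rse, key, value)`, as named in the points above. The set of setting keys is deliberately limited to the two discussed here and is not exhaustive.

```python
from typing import Any, Dict, List, Tuple

# Keys the review identifies as RSE settings. Anything else is treated as an
# attribute in this sketch. Illustrative only, not an exhaustive list.
SETTING_KEYS = frozenset({"rse_type", "verify_checksum"})


def split_settings_and_attributes(
    config: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate RSE settings from RSE attributes."""
    settings = {k: v for k, v in config.items() if k in SETTING_KEYS}
    attributes = {k: v for k, v in config.items() if k not in SETTING_KEYS}
    return settings, attributes


def apply_rse_config(client: Any, rse: str, config: Dict[str, Any]) -> None:
    """Send settings through update_rse and attributes through add_rse_attribute.

    `client` is any object exposing those two methods, for example an RSEClient.
    """
    settings, attributes = split_settings_and_attributes(config)
    if settings:
        client.update_rse(rse, settings)
    for key, value in attributes.items():
        client.add_rse_attribute(rse, key, value)


class RecordingClient:
    """Hypothetical stand-in that records calls instead of contacting a server."""

    def __init__(self) -> None:
        self.calls: List[tuple] = []

    def update_rse(self, rse: str, parameters: Dict[str, Any]) -> None:
        self.calls.append(("update_rse", rse, parameters))

    def add_rse_attribute(self, rse: str, key: str, value: Any) -> None:
        self.calls.append(("add_rse_attribute", rse, key, value))
```

Passing `verify_checksum` as the boolean `False` rather than the string `'False'` matters here, since the value is forwarded unchanged. Against a live server, the stand-in would presumably be replaced by a real `RSEClient` instance.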

7) Quota sizes: the guide’s TB/PB byte counts are TiB/PiB, not what Rucio parses as “TB/PB”

Inaccurate

  • 1 TB = 1099511627776 bytes
  • 1 PB = 1125899906842624 bytes

What’s correct

  • Rucio’s get_bytes_value_from_string uses decimal multipliers:
    • TB = 10**12
    • PB = 10**15
  • Therefore:
    • 1 TB = 1000000000000 bytes
    • 1 PB = 1000000000000000 bytes
  • If you want binary units, label them as TiB/PiB or provide explicit byte counts.
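
The gap between the decimal and binary units can be checked with a few lines of arithmetic. The sketch below is an independent illustration, not Rucio's `get_bytes_value_from_string`. The table names and the `to_bytes` helper are hypothetical.

```python
# Decimal (SI) multipliers, as described above for TB and PB.
DECIMAL_UNITS = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}

# Binary (IEC) multipliers, which is what the guide's original byte counts match.
BINARY_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40, "PiB": 2**50}


def to_bytes(value: int, unit: str) -> int:
    """Convert a whole number of units to bytes using an explicit unit label."""
    if unit in DECIMAL_UNITS:
        return value * DECIMAL_UNITS[unit]
    if unit in BINARY_UNITS:
        return value * BINARY_UNITS[unit]
    raise ValueError(f"unknown unit: {unit}")
```

For instance, `to_bytes(1, "TB")` gives 1000000000000 while `to_bytes(1, "TiB")` gives 1099511627776, a difference of roughly 10%, which is why the label on a quota matters.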

8) Tape-specific: archive_timeout is not “file staging (stage-in)”

Inaccurate

  • “archive_timeout ... maximum time for file staging”

What’s correct

  • archive_timeout is used for transfers with a tape destination to control how long the FTS transfer manager waits for archival completion (terminal FAILED/FINISHED states). It does not control stage-in / bring-online.

9) Example attributes likely deployment-specific / outdated (or at least not part of the documented core set)

Potentially misleading

  • Presenting backend_type and storage_usage_tool as core, required configuration for a POSIX RSE.

Safer / correct framing

  • These keys are not part of the documented RSE settings/attributes list in current upstream docs.
  • If you keep them, document them explicitly as deployment-specific (i.e., only meaningful if your site runs custom integrations which read them).

10) Minor wording-level inaccuracies (optional to fix, but improves correctness)

  • “POSIX RSEs cannot be accessed from remote machines”
    • More accurate: file:// (POSIX) access works from any machine which can see the filesystem path (e.g., via a shared mount). It’s not inherently “single-node”, it’s “same filesystem namespace”.

11) Non-deterministic RSEs: naming_convention is not the PFN naming algorithm

Inaccurate / misleading

  • Implying that setting naming_convention defines how PFNs are generated for a non-deterministic RSE.

What’s correct

  • Non-deterministic behaviour is controlled by the RSE’s deterministic flag (e.g. creating the RSE as non-deterministic).
  • naming_convention is documented as a policy algorithm used to validate DIDs on an RSE; it is not the LFN→PFN mapping algorithm (that role belongs to lfn2pfn_algorithm for deterministic RSEs).
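
To make "deterministic" concrete: a deterministic mapping derives the path from the scope and name alone, so the same DID always yields the same path. The sketch below is loosely modelled on a hash-style layout and is an assumption for illustration. The function name `hash_style_lfn2pfn` is hypothetical and this is not Rucio's implementation.

```python
import hashlib


def hash_style_lfn2pfn(scope: str, name: str) -> str:
    """Derive a relative path from scope and name only (illustrative sketch).

    Two hex-digit directory levels taken from an MD5 of "scope:name" spread
    files across directories. No metadata or timestamps are involved.
    """
    digest = hashlib.md5(f"{scope}:{name}".encode("utf-8")).hexdigest()
    return f"{scope}/{digest[0:2]}/{digest[2:4]}/{name}"
```

Because nothing but the scope and name enters the computation, the path can be recomputed at any time without a lookup, which is the property a non-deterministic RSE lacks.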

Development

Successfully merging this pull request may close these issues.

Add RSE operator docs

5 participants